A labeled dataset for building HVAC systems operating in faulted and fault-free states

Open data is fueling innovation across many fields. In the domain of building science, datasets that can be used to inform the development of operational applications - for example new control algorithms and performance analysis methods - are extremely difficult to come by. This article summarizes the development and content of the largest known public dataset of building system operations in faulted and fault free states. It covers the most common HVAC systems and configurations in commercial buildings, across a range of climates, fault types, and fault severities. The time series points that are contained in the dataset include measurements that are commonly encountered in existing buildings as well as some that are less typical. Simulation tools, experimental test facilities, and in-situ field operation were used to generate the data. To inform more data-hungry algorithms, most of the simulated data cover a year of operation for each fault-severity combination. The data set is a significant expansion of that first published by the lead authors in 2020.


Background & Summary
Fault detection and diagnostics (FDD) is a well-established field of study in building science and building technology applications. This is largely driven by the significant impact of equipment faults and control problems on building energy use and emissions, equipment life, and occupant comfort. Building HVAC systems, in particular, afford a rich opportunity space for FDD algorithm development, given the multiplicity of system configurations, complex operations, and availability of monitored data. In addition, the recent push to decarbonize buildings and the electricity sector is increasing the importance of grid-interactive efficient buildings that can reliably provide load-flexibility services to a renewable-supplied grid. This makes it even more critical to ensure that building HVAC systems are controllable and fault-free, providing further motivation for FDD technology development and deployment.
In buildings, FDD software tools employ operational data collected from building automation systems, sensors, and meters to automatically detect equipment and control problems, or degrading performance in an HVAC system, and to diagnose potential root causes 1. Using the results from FDD technologies, building operators can efficiently direct maintenance activities to address inefficiencies, or equipment and control malfunctions.
In the past thirty years, a large body of literature has been published documenting the development and application of FDD solutions for buildings. The active research covers a breadth of topics including: (1) the development and validation of hundreds of FDD methods 2-4; (2) the development of experimental platforms or simulation software tools to generate fault-inclusive models 5-7, and the development of fault-inclusive data sets 8-10; (3) quantification of the prevalence and occurrence rates of faults in buildings 11-13; (4) analysis of the impact of faults on system operations 14,15, energy consumption 16,17, equipment maintenance and operational costs 18,19, occupant thermal comfort 15,20,21, and indoor air quality 22; (5) FDD technology application, costs, and benefits in existing buildings 1,23; (6) FDD algorithm performance testing methodologies 24,25; and (7) automated fault correction 26,27 and maintenance activities 28 after faults are diagnosed and flagged by FDD tools.
Although building control and automation systems are able to store and export large volumes of operational data, these data are often prone to data quality issues including erroneous sensor readings and gaps. Consistent naming conventions are not used from one system to another, and semantic metadata to interpret the meaning and relationships between data are rarely used. A further complication is that the data reflect the unknown and unlabeled presence of a wide variety of commonly occurring faults. Finally, while small collections of field data may be acquired by researchers, it is extremely difficult to amass a large-scale dataset that represents climate, HVAC system, and operational diversity. This presents tremendous barriers to innovation in FDD algorithm development and performance evaluation.
1 Lawrence Berkeley National Laboratory, Berkeley, USA. 2 Drexel University, Philadelphia, USA. 3 Oak Ridge National Laboratory, Oak Ridge, USA. 4 National Renewable Energy Laboratory, Golden, USA. ✉ e-mail: JGranderson@lbl.gov; YiminChen@lbl.gov
Extending the body of work focused on FDD algorithm testing methods and test datasets, this paper documents a significant expansion of the HVAC fault dataset presented in 9. The expansion incorporates five new HVAC systems and configurations, an increased number of fault cases, and more extensive time spans for each fault-intensity combination (in most cases reaching a full 365 days). The data were produced using simulation tools, laboratory experimental facilities, and field tests. Additionally, a semantic model for each system has been developed according to the Brick schema 29 for improved usability and conformance with today's commonly used building industry metadata schema.
The expanded dataset documented in this article includes seven common HVAC systems: the single duct air handling unit (AHU) system, the packaged rooftop unit (RTU), the dual duct AHU system, the fan coil unit (FCU) system, the variable air volume fan power unit (FPU), the boiler plant, and the chiller plant. A total of 257 fault cases are represented, spanning sensor-related faults, actuator-related faults, control faults (e.g., controller PID parameter settings), and component faults (e.g., cooling coil fouling fault). In total, the dataset comprises 8 billion data samples, and represents the largest known ground truth-verified data for HVAC faults. As noted in the 2020 publication 9, FDD researchers and developers can use the data to:
• Develop, evaluate, and compare performance across FDD algorithms;
• Identify performance gaps to focus future development efforts and resource investment;
• Develop an understanding of how FDD technology overall is improving over time; and
• Enable a better understanding of HVAC system performance under faulted and fault-free operation conditions for educational purposes.
Prior work, such as ASHRAE research projects RP-1312 and RP-1043 and the National Institute of Standards and Technology (NIST) 10D243 project, represents early contributions of operational HVAC fault data. This research advances those early efforts by increasing the number and type of HVAC systems that are represented, by increasing the duration of fault-free and faulted operational span (one year in most cases), and by increasing the number and type of faults that are represented. This will significantly increase the usability of the dataset for FDD algorithm development and performance evaluation.

Methods
The newly expanded dataset contains experimental and simulated data across the seven HVAC system types and configurations that are represented, the majority being simulated. Diverse facilities and simulation tools were used to create the data, and methods to impose the faults were created for each fault, given the specific HVAC system of focus and the control sequences that defined its operation. These facilities and tools, HVAC system details, and fault methods are described in the following, as is the metadata schema that was applied to the data. Provision of the metadata enables ease of interpretation of the data, and supports users of the dataset who wish to employ more automated procedures to interface FDD algorithm instances with the data.
Facilities and simulation tools. The simulated datasets were created using HVACSIM+ and an EnergyPlus-Modelica co-simulation. HVACSIM+ was developed by the US NIST 30, the Modelica Buildings Library 31 is developed by the Lawrence Berkeley National Laboratory, and EnergyPlus 32 is developed by several contributors through funding from the US Department of Energy. Described with respect to other modeling tools in 33, HVACSIM+, Modelica, and EnergyPlus are non-proprietary tools to model the behavior of building HVAC systems using physics-based approaches. In addition, Modelon's air conditioning library was used to model the refrigerant side faults in the RTU system 34. This library provides ready-to-use refrigeration cycle templates and a wide range of components to create a variety of air conditioning system configurations.
Four experimental research facilities were used to create data and to develop and validate simulation models:
1. FLEXLAB, located at the Lawrence Berkeley National Laboratory in Berkeley, California, for the generation of the single-zone CAV data set and the variable-air-volume (VAV) AHU data set 9.
2. The Flexible Research Platform (FRP), located at the Oak Ridge National Laboratory in Oak Ridge, Tennessee, for the generation of RTU data sets 9.
3. The Energy Resource Station facility, previously located at the Iowa Energy Center in Ames City, Iowa, for the development and validation of DD-AHU, FCU and FPU simulation models, and for creation of multi-zone VAV AHU data 35.
4. The RTU facility, located in the Thermal Technology Facility (TTF) at the National Renewable Energy Laboratory in Golden, Colorado, for the validation of the RTU simulation model.
NREL's TTF is a flexible multipurpose laboratory that enables detailed evaluation and development of building and thermal energy systems. The TTF research space covers 11,000 sq. ft. Two RTUs, a 5-ton/SEER 17 unit (RTU 1) and a 6-ton/IEER 23 unit (RTU 2), are installed in the TTF to develop comprehensive performance maps suitable for use with whole-building energy simulation computer programs. The SEER 17 unit contained a two-stage scroll compressor with R-410A, a single-speed condenser fan, a direct-drive variable-supply air fan with a high-efficiency motor, low leak dampers, hot gas reheat humidity control, and an economizer. The IEER 23 unit contained a variable-speed direct-drive compressor, variable-speed fans, and control logic that maintained the compressor and thermal expansion valve (TXV) within their performance limitations 36.
Field data representing faulted and unfaulted rooftop unit operation are also included in the dataset. These data were collected from two RTUs, one in a restaurant building in Milford, CT and another in a distribution center building in Colchester, CT. Table 1 summarizes these sites and the RTUs.
System configurations and control sequences. The configurations and sequences for each system in the data set are comprehensively documented for users of the data in an inventory file. This information is often needed to specify controls-specific parameters in fault detection and diagnostic algorithms. To illustrate the form and content of this information, two examples are presented: the fan coil unit system and the boiler plant.
Fan coil unit. Figure 1 contains the schematic representation of the fan coil unit (FCU) system.
The FCU is scheduled for automatic operation on a time of day basis for occupied and unoccupied mode.
Occupied mode (Monday-Friday, 6:00-17:59). During these hours, the system is in Operate Mode. Five control sequences (fan control, outdoor air damper control, cooling coil valve control, heating coil valve control, and zone temperature setpoints) were set during the simulation.

• Fan control
  • 3-speed fan with "Automatic On/Off" (Auto) mode: the fan on/off state and speed changes are based on the cooling proportional-integral-derivative (PID) output and the heating PID output. A 10% dead band is applied at each speed switchover level.
    • Low speed condition: the PID outputs (the cooling/heating coil valve position) are higher than 0% and lower than 40%;
    • Medium speed condition: the PID outputs (the cooling/heating coil valve position) are >= 40% and < 80%;
    • High speed condition: the PID outputs (the cooling/heating coil valve position) are >= 80% and < 100%;
    • Off: no heating or cooling demand.
• OA damper control
  • The OA damper maintains a minimum damper position of 30%.
• Cooling coil valve control sequence
  • PID control is used to adjust the cooling coil valve position. The setpoint dead band is 1 °F. If the actual room temperature is beyond 1 °F of the cooling setpoint, the FCU is in the "cooling" mode, the cooling coil valve PID loop is enabled, and the cooling valve position is controlled by the cooling coil valve controller PID output. When the room temperature falls below 1 °F compared to the cooling setpoint, the cooling PID is disabled and the valve is fully closed.
• Heating coil valve control sequence
  • PID control is used to adjust the heating coil valve position. The setpoint dead band is 1 °F. If the actual room temperature is beyond 1 °F of the heating setpoint, the FCU is in the "heating" mode, the heating coil valve PID loop is enabled, and the heating valve position is controlled by the heating coil valve controller PID output. When the room temperature falls below 1 °F compared to the heating setpoint, the heating PID is disabled and the valve is fully closed.
• Zone temperature setpoints
  • Zone cooling setpoint: 72 °F;
  • Zone heating setpoint: 68 °F.
• Shutdown mode
  • The shutdown mode is only triggered by the low temperature protection described below. Under the shutdown mode, the fan is constantly off, and the OA damper is fully closed.
• Low Temperature Protection
  • During the simulation, when the mixed air temperature is below 35 °F and persists for 300 seconds, the FCU system switches to the shutdown mode to prevent freezing the coil. The shutdown mode lasts until the end of the day. The system returns to normal operation at the beginning of the next day.

Unoccupied mode. During these hours, the system is in Setback Mode. The operation is similar to that in Operate Mode, except for two settings:
• Outdoor air damper: the OA damper is fully closed.
• Zone temperature setpoints
  • Zone cooling setpoint: 85 °F;
  • Zone heating setpoint: 55 °F.
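The schedule, setpoint, and fan speed rules listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulation code used to generate the dataset: the function names are hypothetical, the 10% dead band at each speed switchover is deliberately omitted, and an output of exactly 100% is assumed to map to high speed.

```python
from datetime import datetime


def is_occupied(ts: datetime) -> bool:
    """Occupied mode: Monday-Friday, 06:00 through 17:59."""
    return ts.weekday() < 5 and 6 <= ts.hour < 18


def zone_setpoints_f(ts: datetime) -> tuple:
    """Return (cooling setpoint, heating setpoint) in °F for a timestamp."""
    return (72.0, 68.0) if is_occupied(ts) else (85.0, 55.0)


def oa_damper_min_pct(ts: datetime) -> float:
    """Minimum OA damper position: 30% when occupied, fully closed otherwise."""
    return 30.0 if is_occupied(ts) else 0.0


def fan_speed(pid_output_pct: float) -> str:
    """Map the active cooling/heating PID output (0-100%) to a fan speed.

    Assumptions: no switchover dead band is modeled, and 100% counts as high.
    """
    if pid_output_pct <= 0.0:
        return "off"
    if pid_output_pct < 40.0:
        return "low"
    if pid_output_pct < 80.0:
        return "medium"
    return "high"
```

A user of the dataset might apply helpers like these to derive the expected setpoints for each timestamp before comparing them with the recorded setpoint trends.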
Boiler plant. Figure 2 illustrates the configuration of the boiler plant system. This system has two identical boilers and two hot water pumps and provides hot water to heating coils in the air-side system. The boiler plant system is controlled by two supervisory controllers and two local controllers (Table 2). One supervisory controller determines the number of operating boilers using a state machine and the calculated heating load, as shown in Fig. 3. The heating load is calculated from $\dot{v}_{hw}$, the volumetric flow rate of the hot water, and from $T_{hw,ent}$ and $T_{hw,lea}$, the temperatures of the hot water entering and leaving the boiler plant system, respectively. The other supervisory controller determines the number of operating hot water pumps, as shown in Fig. 4.

Fault scenarios and methods of fault imposition. Tables 3-10 summarize fault profiles and how each fault was imposed for each of the systems and fault scenarios. For the simulated datasets, each fault type and intensity was imposed for a full calendar year of operation, the exception being the simulated RTU dataset, which covered a 100-day cooling season. For the experimental and field test datasets, fault type-intensity combinations were captured for one to 183 days of operation. The RTU dataset that was acquired from field measurements reflected a naturally occurring compressor staging fault and a refrigerant undercharging fault.
Method of Brick schema model development. The Brick schema 29 offers classes and subclasses, of which the equipment class was used to designate the HVAC system components represented in the fault dataset. Similarly, the point subclass was used to designate sensor measurement and control system data points. In addition, the schema offers 'relationships', of which hasPart, hasPoint, and feeds are relevant to describing the fault dataset. Figure 5 illustrates the 5-step process that was used to generate the Brick models for each HVAC system in the dataset. Among them, Step 4 is automated while the other steps are performed manually.
Step 1: Conceptualization of Brick relationships using mechanical drawing or schematic. The schematic representations for each system were reviewed to identify the major components of the overall system, to develop compositional ("hasPart") relationships. For each major component, we identified all of the associated sensor/control points to develop "hasPoint" relationships. Lastly, we identified the order in which the given media (air, water, etc.) flow through the system to develop sequential ("feeds") relationships between different equipment.
Step 2: Creation of hierarchical diagram to visualize Brick relationships. After identifying the components and sensor/control points of the system in Step 1, we indicate which equipment has which components ("hasPart"), which equipment or component has which sensor and control data points ("hasPoint"), and which equipment feeds into another equipment ("feeds").
Step 3: Mapping system components to Brick classes. All equipment, components, and sensor/control data points in the hierarchical diagram are mapped to a Brick schema class and tabulated. The equipment and the subcomponents are mapped to a subclass of the Brick "equipment" class (e.g., chiller, AHU, and RTU), and the sensors and the control points are assigned a subclass of the Brick "point" class.
For each row (i.e., each component), we designate the relevant relationships and the other components to which it is connected. In this way, the table captures all of the components, their types, and how they are related to other components.
Step 4: Execution of script to create a .ttl file. The tables generated in Step 3 are exported as CSV files and imported to a Python script that generates a Brick model in the form of a machine-readable .ttl file. The script iterates through each row of the table, assigning all components and points to a specific instantiation of a Brick class and corresponding relationships. The .ttl file can be accessed by an FDD algorithm (or other applications), enabling more efficient and standardized retrieval of system metadata using SPARQL queries. This streamlines the interpretation of data semantics within the FDD or other applications.
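To make Step 4 concrete, the following is a minimal sketch of how a tabulated component list could be turned into Turtle text. It is not the authors' script: the CSV column names (`name`, `brick_class`, `parent`, `parent_relationship`), the `ex:` namespace, and the sample rows are hypothetical, and the Brick class names are used for illustration only. The sketch uses only the Python standard library and emits plain triples rather than relying on an RDF library.

```python
import csv
import io

BRICK = "https://brickschema.org/schema/Brick#"
EX = "urn:example:fcu#"  # hypothetical namespace for the instances

# Hypothetical table exported from Step 3: one row per component or point.
SAMPLE_CSV = """name,brick_class,parent,parent_relationship
FCU_1,Fan_Coil_Unit,,
Cooling_Coil_1,Cooling_Coil,FCU_1,hasPart
FCU_1_ZAT,Zone_Air_Temperature_Sensor,FCU_1,hasPoint
"""


def rows_to_turtle(rows) -> str:
    """Emit one type triple per row, plus one relationship triple if a parent is given."""
    lines = [f"@prefix brick: <{BRICK}> .", f"@prefix ex: <{EX}> .", ""]
    for row in rows:
        # Instantiate the component or point as a member of its Brick class.
        lines.append(f"ex:{row['name']} a brick:{row['brick_class']} .")
        # Link it to its parent with hasPart, hasPoint, or feeds.
        if row.get("parent") and row.get("parent_relationship"):
            lines.append(
                f"ex:{row['parent']} brick:{row['parent_relationship']} ex:{row['name']} ."
            )
    return "\n".join(lines) + "\n"


def csv_text_to_turtle(csv_text: str) -> str:
    """Parse CSV text and return the corresponding Turtle document."""
    return rows_to_turtle(csv.DictReader(io.StringIO(csv_text)))
```

Calling `csv_text_to_turtle(SAMPLE_CSV)` returns a small Turtle document that could be written to a .ttl file and then queried with SPARQL by any RDF-aware tool.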
Step 5: Visualization of the Brick model to validate accuracy. The generated Brick model is verified by visualizing it and comparing it to the hierarchical diagram in Step 2. We used Brick Studio for the visualization and ensured that all the components in the data sets were present and the relationships between them were labeled correctly.

Boiler plant local controllers (Table 2):
• Heating power of operating boilers: The heating power of each operating boiler is controlled by a feedback loop that maintains the temperature of the water leaving each boiler at a predefined value (176 °F).
• Speeds of operating hot water pumps: Hot water pump speed is controlled by a feedback loop that maintains the pressure difference in the hot water loop at 17.5 psi. If two hot water pumps are running, both pumps operate at the same speed.
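The boiler staging controller described earlier relies on a heating load computed from the hot water flow rate and the entering and leaving water temperatures. The exact expression used in the simulation is not reproduced here; the sketch below assumes the standard sensible heat relation, Q = ρ · c_p · v_hw · (T_lea − T_ent), with constant water properties. The property values, units, sign convention, and function name are all assumptions made for illustration.

```python
# Assumed constant water properties (illustrative values, not from the dataset).
RHO_WATER_KG_M3 = 1000.0   # density, kg/m^3
CP_WATER_KJ_KGK = 4.186    # specific heat, kJ/(kg*K)


def heating_load_kw(v_hw_m3_s: float, t_hw_ent_c: float, t_hw_lea_c: float) -> float:
    """Sensible heating load in kW delivered to the hot water loop.

    v_hw_m3_s  : volumetric flow rate of the hot water, m^3/s
    t_hw_ent_c : temperature of the hot water entering the boiler plant, °C
    t_hw_lea_c : temperature of the hot water leaving the boiler plant, °C
    """
    return RHO_WATER_KG_M3 * CP_WATER_KJ_KGK * v_hw_m3_s * (t_hw_lea_c - t_hw_ent_c)
```

With these assumed properties, a flow of 0.002 m^3/s heated from 60 °C to 80 °C corresponds to a load of about 167.4 kW.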

Data records
The data are stored on figshare 37 and on an LBNL website 10. The description of the seven expanded data sets can be found in Table 11. For each system, the FDD data are stored in individual comma-separated value (CSV) files, and each file contains one fault type at one fault intensity. The data are stored at 1-minute intervals to reflect system operations. The 1-minute data can be resampled to 5-minute and 15-minute intervals, which are also commonly used in existing building automation systems (BAS). Time stamps are in the first column of each file, and are presented in the format "yyyymmdd hh:mm". Each system dataset is accompanied by a .ttl Brick model and a data 'inventory' file that describes the key information necessary to understand the content and scope of each data set.

Table 6. Methods of fault imposition for the VAV fan power unit dataset.
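As an illustration of the file layout described above, the following sketch reads one data column from a CSV file whose first column holds time stamps in the "yyyymmdd hh:mm" format. The column names and sample values are hypothetical; the actual point names for each system are listed in the corresponding inventory file.

```python
import csv
import io
from datetime import datetime

# Hypothetical excerpt of a dataset file; real files use the point names
# documented in each system's inventory file.
SAMPLE_FILE = """Datetime,ZONE_TEMP
20180101 00:00,70.5
20180101 00:01,70.6
"""


def read_series(fileobj, column: str) -> list:
    """Return a list of (datetime, float) pairs for one named column.

    The time stamp is taken from the first column of each row.
    """
    reader = csv.reader(fileobj)
    header = next(reader)
    idx = header.index(column)
    series = []
    for row in reader:
        ts = datetime.strptime(row[0].strip(), "%Y%m%d %H:%M")
        series.append((ts, float(row[idx])))
    return series
```

For a file on disk, the same function could be called with an opened file, for example `read_series(open(path, newline=""), "ZONE_TEMP")`, where `path` and the column name are placeholders.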

Methods of fault imposition for the chiller plant dataset (fault type; fault intensity; method of fault imposition):
• Chiller 1, chilled water leaving temperature sensor bias; −2 °C, −1 °C, 1 °C, and 2 °C; add a bias to the sensor output.
• Cooling tower 1, condenser water leaving temperature sensor bias; −2 °C, −1 °C, 1 °C, and 2 °C; add a bias to the sensor output.
• Secondary chilled water loop, differential pressure sensor bias; −20%, −10%, 10%, and 20%; add a bias to the sensor output.
• Condenser water leaving three-way valve leakage; 25%, 50%, and 75%; increase the default minimum position setting.
• Condenser water leaving three-way valve stuck; 50% and 75%; assign a fixed device position.

Table 9. Methods of fault imposition for the experimental RTU dataset. *The correct economizer setpoint is 10 °C (50 °F). **Faults were imposed in Fall 2020, Spring 2021, Summer 2021, and Winter 2022.

Methods of fault imposition for the boiler plant dataset (fault type; fault intensity; method of fault imposition):
• Boiler 1, hot water leaving temperature sensor bias; −2 °C, −1 °C, 1 °C, and 2 °C; add a bias to the sensor output.
• Hot water loop, hot water leaving temperature sensor bias; −2 °C, −1 °C, 1 °C, and 2 °C; add a bias to the sensor output.
• Hot water loop, differential pressure sensor bias; −20%, −10%, 10%, and 20%; add a bias to the sensor output.
• Boiler 1, heat exchanger fouling; 65%, 80%, and 95%; multiply the intensity by the heat transfer coefficient.
• Inappropriate tuning of the PI controller for the boiler supply temperature setpoint; NA; modify the gain value of controllers.

Granderson et al. 9 documented that the validity of the dataset can be assessed according to three dimensions: (1) accuracy of the sensors and measurement infrastructure in the experimental facilities that were used; (2) accuracy of the simulation models that were used; and (3) accuracy of the ground truth labels that indicate the presence or absence of faults and their severity.
Table 11. Files and size of each file in the full dataset, as well as system of focus and provenance. *The RTU system includes three .ttl files and one inventory file.

Facility measurement and simulation models. For the sake of brevity, the reader is referred to the prior publications by Granderson et al. for details on facility measurement and simulation model accuracy.
Granderson et al. 9 describe a ground truth validation process that applies functional testing and engineering logic. Functional testing verifies that system operation is consistent with the designed control sequences, and reflective of fault-free operational behavior. Engineering logic and the specified control sequence are combined to confirm that the data trends do indeed reflect the behaviors of the fault-free and faulted scenarios. Figure 6 provides a few examples for the fan coil unit system. First, the data trends are inspected to confirm that the system is operated according to the defined schedule of occupied hours, 6:00-17:59, and to the setpoints specified in the sequence (as shown in the section System configurations and control sequences). This is verified in the profiles of the cooling setpoint and heating setpoint trends, which respectively modulate from 85 °F to 72 °F and from 55 °F to 68 °F at the 6:00 timestamp, and back at the 17:59 timestamp. Next, the data trends are inspected to verify that the modeled PID parameters for the cooling valve controller are configured to output proper control signals. This is confirmed by the smooth trend and the absence of any significant oscillations in the plotted signal for the cooling coil valve command. Finally, inspection of the zone temperature trend confirms that the control objective, i.e., a cooling setpoint of 72 °F, was maintained throughout the occupied period of operation.
Following verification of the fault-free operational state, additional tests were conducted for each of the faulted scenarios. These tests considered (a) whether the imposed fault condition was correctly reflected in the data, and (b) whether the anticipated symptoms of the fault were reflected in other operational trends. Figure 7 illustrates these two types of tests for one FCU system fault: a zone air temperature sensor bias of +2 °C (3.6 °F). The biased condition is confirmed by the 2 °C offset between the data trend from the 'spoofed' faulted model output point (solid line) and the unaltered output point (dashed line). This is clearly discernible and annotated in the righthand portion of the plot. The symptoms of this bias are observed by comparing the cooling coil valve position in the faulted case (solid black line) to that in the unfaulted case (dashed black line). The position in the faulted case is significantly higher because the controller was attempting to provide an increased amount of cooling commensurate with the erroneously high zone air temperature reading. Figure 8 illustrates another FCU system fault: a cooling coil valve stuck at 20%, imposed during the cooling season. Here, the faulted condition is confirmed by observing that the valve position signal (solid black line) is fixed at 0.2, while the valve command signal (dashed black line) is adjusted. The symptom of this fault is that the zone temperature (solid purple line) significantly exceeded the 72 °F cooling setpoint during the occupied hours, even though the cooling coil valve control signal (dashed black line) reached a value of 1 (i.e., 100% position) in the controller's attempt to provide maximum cooling.
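The two checks illustrated in Figs. 7 and 8 can be approximated programmatically. The sketch below is an illustrative assumption about how a user might automate such checks, not the validation procedure applied by the authors; the function names, tolerance, and sample values are hypothetical.

```python
from statistics import mean


def mean_offset(faulted: list, unfaulted: list) -> float:
    """Average difference between a biased sensor trend and its unaltered counterpart."""
    return mean(f - u for f, u in zip(faulted, unfaulted))


def is_stuck(position: list, command: list, stuck_at: float, tol: float = 1e-3) -> bool:
    """True if the position stays at `stuck_at` while the command signal varies."""
    position_fixed = all(abs(p - stuck_at) <= tol for p in position)
    command_varies = (max(command) - min(command)) > tol
    return position_fixed and command_varies


# Hypothetical samples: a +2 °C sensor bias and a valve stuck at 20%.
biased = [24.0, 25.0, 26.0]
unbiased = [22.0, 23.0, 24.0]
valve_position = [0.2, 0.2, 0.2]
valve_command = [0.3, 0.7, 1.0]
```

Here `mean_offset(biased, unbiased)` recovers the imposed bias, and `is_stuck(valve_position, valve_command, 0.2)` flags the stuck valve because the position never follows the changing command.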
Similar verification steps were performed for each fault type at each intensity level in the experimental and simulated data sets. For the simulated data sets that spanned a full year of operation, a sample of at least three days was selected for inspection from each of three operational seasons: the summer/cooling season, the winter/heating season, and a transitional/swing season. This sampling enabled validation of the data and faulted system behaviors under different weather conditions and operational modes 38.

Usage Notes
A complete inventory of the data was developed to support users in interpreting the content and form of the data, and the corresponding HVAC systems, controls, and faults. The data themselves comprise time series that can be analyzed with whatever software tools the user elects to implement. The data are provided at 1-minute intervals, and can be resampled to suit specific applications.
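As one example of such resampling, the sketch below averages 1-minute samples into fixed-length bins using only the Python standard library. Averaging is an assumption made here for illustration; other aggregations (for example, taking the last value in each bin) may suit command or status signals better. The function name is hypothetical, and the bin length is assumed to divide 60 evenly.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def resample_mean(samples: list, minutes: int) -> list:
    """Average (datetime, value) samples into bins of `minutes` length.

    Each bin is labeled with its start time. Assumes `minutes` divides 60.
    """
    buckets = defaultdict(list)
    for ts, value in samples:
        # Floor the time stamp to the start of its bin.
        start = ts.replace(minute=ts.minute - ts.minute % minutes,
                           second=0, microsecond=0)
        buckets[start].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Hypothetical 1-minute series: 30 samples with values 0.0 through 29.0.
t0 = datetime(2018, 1, 1, 0, 0)
one_minute = [(t0 + timedelta(minutes=i), float(i)) for i in range(30)]
fifteen_minute = resample_mean(one_minute, 15)
```

In this example, `fifteen_minute` contains two bins, starting at 00:00 and 00:15, with mean values of 7.0 and 22.0.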